Abstract

In this short note, we prove that the set of $m$-dissimilarity vectors of phylogenetic $n$-trees is contained in the tropical Grassmannian $\mathcal{G}_{m,n}$, answering a question of Pachter and Speyer. We do this by proving an equivalent conjecture proposed by Cools.

1 Introduction.

This article deals with the connection between phylogenetic trees and tropical geometry. That these two subjects are mathematically related can be traced back to Pachter and Speyer [7], Speyer and Sturmfels [9], and Ardila and Klivans [1]. The precise nature of this connection has been the subject of recent papers by Bocci and Cools [2] and Cools [4]. In particular, a relation between $m$-dissimilarity vectors of phylogenetic $n$-trees and the tropical Grassmannians $\mathcal{G}_{m,n}$ has been noted.

Theorem 1.1 (Pachter and Sturmfels [8]). The set of 2-dissimilarity vectors is equal to the tropical Grassmannian $\mathcal{G}_{2,n}$.

This naturally raises the following question. 

Question 1.2 (Pachter and Speyer [7], Problem 3). Does the space of $m$-dissimilarity vectors lie in $\mathcal{G}_{m,n}$ for $m \geq 3$?

The result in this article addresses this question. It builds on two papers, by Cools [4] and by Bocci and Cools [2], where the cases $m = 3$, $m = 4$ and $m = 5$ are handled. We answer Question 1.2 affirmatively for all $m$:



Theorem 1.3. The set of $m$-dissimilarity vectors of phylogenetic $n$-trees is contained in the tropical Grassmannian $\mathcal{G}_{m,n}$.

As stated above, we prove Theorem 1.3 by proving an equivalent conjecture, stated as Proposition 3.1 of this paper; see also Conjecture 4.4 of [4].



2 Definitions. 

2.1 The Tropical Grassmannian. 

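Let $\mathbb{K} = \mathbb{C}\{\{t\}\}$ be the field of Puiseux series. Recall that this is the algebraically closed field of formal expressions

$$\omega = \sum_{k=p}^{\infty} c_k t^{k/q}$$

where $p \in \mathbb{Z}$, $c_p \neq 0$, $q \in \mathbb{Z}^{+}$ and $c_k \in \mathbb{C}$ for all $k \geq p$. It is the algebraic closure of the field of Laurent series over $\mathbb{C}$. The field comes equipped with a standard valuation $\mathrm{val} : \mathbb{K} \to \mathbb{Q} \cup \{\infty\}$ given by $\mathrm{val}(\omega) = p/q$. As a convention, $\mathrm{val}(0) = \infty$.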
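The valuation can be illustrated on finite truncations of Puiseux series. The following is only a sketch: it assumes a series is stored as a Python dictionary from rational exponents to coefficients, and the names `val` and `multiply` are ours, not part of the paper.

```python
from fractions import Fraction

def val(series):
    """Valuation of a finite Puiseux sum given as {exponent: coefficient}."""
    exponents = [e for e, c in series.items() if c != 0]
    # By convention the zero series has valuation infinity.
    return min(exponents) if exponents else float("inf")

def multiply(f, g):
    """Product of two finite Puiseux sums."""
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return out

omega = {Fraction(-1, 2): 3, Fraction(2): 1}   # 3 t^(-1/2) + t^2
eta = {Fraction(1, 3): 5, Fraction(1): -2}     # 5 t^(1/3) - 2 t
```

Here `val(omega)` is $-1/2$, and the valuation of the product `multiply(omega, eta)` is the sum of the two valuations, as expected of a valuation.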

Now, let $x = (x_{ij})$ be an $m \times n$ matrix of indeterminates and let $\mathbb{K}[x]$ denote the polynomial ring over $\mathbb{K}$ generated by these indeterminates. Fix a second polynomial ring in $\binom{n}{m}$ indeterminates over the same field:

$$\mathbb{K}[p] = \mathbb{K}[p_{i_1, i_2, \ldots, i_m} : 1 \leq i_1 < i_2 < \cdots < i_m \leq n]$$

Let $\phi_{m,n} : \mathbb{K}[p] \to \mathbb{K}[x]$ be the homomorphism of rings taking $p_{i_1, i_2, \ldots, i_m}$ to the maximal minor of $x$ obtained from columns $i_1, \ldots, i_m$.

Definition 2.1. The Plücker ideal or ideal of Plücker relations is the homogeneous prime ideal $I_{m,n} = \ker(\phi_{m,n})$, which consists of the algebraic relations or syzygies among the $m \times m$ minors of any $m \times n$ matrix with entries in $\mathbb{K}$.

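For $m \geq 3$, the Plücker ideal has a Gröbner basis consisting of quadrics; a comprehensive study of these ideals can be found in Chapter 14 of the book by Miller and Sturmfels [6] and in Sturmfels [10]. It is a polynomial ideal in $\mathbb{K}[p]$ and we can define its tropical variety in the usual way, as we now recall. Let $a = \binom{n}{m}$ and $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty\}$. Consider

$$f = \sum_{\alpha \in \mathbb{N}^a} c_{\alpha}\, p_{\sigma_1}^{\alpha_1} p_{\sigma_2}^{\alpha_2} \cdots p_{\sigma_a}^{\alpha_a} \in \mathbb{K}[p],$$

where $\sigma_1, \ldots, \sigma_a$ are the $a$ $m$-subsets of $\{1, \ldots, n\}$. The tropicalization of $f$ is given by

$$\mathrm{trop}(f) = \min_{\alpha} \{ \mathrm{val}(c_{\alpha}) + \alpha_1 p_{\sigma_1} + \alpha_2 p_{\sigma_2} + \cdots + \alpha_a p_{\sigma_a} \}.$$

The tropical hypersurface $\mathcal{T}(f)$ of $f$ is the set of points in $\overline{\mathbb{R}}^{a}$ where $\mathrm{trop}(f)$ attains its minimum at least twice or, equivalently, where $\mathrm{trop}(f)$ is not differentiable.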
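As an illustration of the "minimum attained twice" condition, the following sketch evaluates the tropicalization of the three-term Plücker relation $p_{12}p_{34} - p_{13}p_{24} + p_{14}p_{23}$ for $m = 2$, $n = 4$. The data layout (a list of pairs of a coefficient valuation and an exponent dictionary) and the two sample points are our own assumptions, chosen only for the example.

```python
def trop_terms(terms, point):
    """Values val(c) + sum(alpha_s * p_s), one for each monomial of f."""
    return [v + sum(exp * point[s] for s, exp in mono.items())
            for v, mono in terms]

def in_tropical_hypersurface(terms, point):
    """True when the minimum over the monomials is attained at least twice."""
    values = trop_terms(terms, point)
    return values.count(min(values)) >= 2

# p12*p34 - p13*p24 + p14*p23; the coefficients +1 and -1 have valuation 0.
plucker = [
    (0, {(1, 2): 1, (3, 4): 1}),
    (0, {(1, 3): 1, (2, 4): 1}),
    (0, {(1, 4): 1, (2, 3): 1}),
]

# Monomial values 3, 3, 4: the minimum 3 is attained twice.
on_hypersurface = {(1, 2): 0, (3, 4): 3, (1, 3): 1, (2, 4): 2, (1, 4): 2, (2, 3): 2}
# Monomial values 0, 3, 4: the minimum 0 is attained only once.
off_hypersurface = {(1, 2): 0, (3, 4): 0, (1, 3): 1, (2, 4): 2, (1, 4): 2, (2, 3): 2}
```

The first point lies on the tropical hypersurface of this relation and the second does not.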
We are now ready to define tropical Grassmannians. 

Definition 2.2. The tropical variety $\mathcal{T}(I_{m,n}) = \bigcap_{f \in I_{m,n}} \mathcal{T}(f)$ of the Plücker ideal $I_{m,n}$ is denoted by $\mathcal{G}_{m,n}$ and is called a tropical Grassmannian.
We have the following fundamental characterization of $\mathcal{G}_{m,n}$, which is a direct application of a more general result [9, Theorem 2.1].

Theorem 2.3. The following subsets of $\overline{\mathbb{R}}^{a}$ coincide:

• The tropical Grassmannian $\mathcal{G}_{m,n}$.

• The closure of the set $\{(\mathrm{val}(c_1), \mathrm{val}(c_2), \ldots, \mathrm{val}(c_a)) : (c_1, c_2, \ldots, c_a) \in V(I_{m,n})\}$.

2.2 Phylogenetic Trees. 

We also treat phylogenetic trees in this paper. 

Definition 2.4. A phylogenetic n-tree is a tree which has a labeling of its $n$ leaves with the set $\{1, \ldots, n\}$ and such that each edge $e$ has a positive real number $w(e)$ associated to it, which we call the weight of $e$.

There is also a crucial related family of trees which we now define: 

Definition 2.5. An ultrametric n-tree is a binary rooted tree which has a labeling of its $n$ leaves with $\{1, \ldots, n\}$ and such that

• each edge $e$ has a nonnegative real number $w(e)$ associated to it, called the weight of $e$

• it is $d$-equidistant, for some $d > 0$, i.e. the sum of the weights of the edges in the path from the root to every leaf is precisely $d$

• the sum of the weights of all edges in the path connecting every two different leaves is positive.

In particular, note that an ultrametric tree is binary and may have edges of weight 0. Now, let $T$ be a phylogenetic $n$-tree. Define the vector $D(m,T)$ whose entries are the numbers $d_{\sigma}$, where $\sigma$ is a subset of $\{1, 2, \ldots, n\}$ of size $m$ and $d_{\sigma}$ is the total weight of the smallest subtree of $T$ which contains the leaves in $\sigma$. By the total weight of a tree, we mean the sum of the weights of all the edges in that tree.

Definition 2.6. The vector $D(m,T)$ is called the m-dissimilarity vector of $T$. The set of all $m$-dissimilarity vectors of phylogenetic trees with $n$ leaves will be called the space of m-dissimilarity vectors of n-trees.
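The entries $d_{\sigma}$ can be computed directly from the definition. The sketch below uses the observation that an edge belongs to the smallest subtree containing the leaves in $\sigma$ exactly when removing it separates two of those leaves. The edge-list representation, the function names and the small 4-leaf tree with internal nodes `"u"` and `"v"` are illustrative assumptions.

```python
from itertools import combinations

def side_of_edge(edges, removed, start):
    """Nodes reachable from `start` once the edge `removed` is deleted."""
    adj = {}
    for u, v, _ in edges:
        if (u, v) != removed:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def subtree_weight(edges, leaves):
    """Total weight of the smallest subtree containing the given leaves."""
    total = 0
    for u, v, w in edges:
        side = side_of_edge(edges, (u, v), u)
        inside = sum(1 for leaf in leaves if leaf in side)
        if 0 < inside < len(leaves):   # the edge separates two chosen leaves
            total += w
    return total

def dissimilarity_vector(edges, n, m):
    """The vector D(m, T), indexed by the m-subsets of {1, ..., n}."""
    return {s: subtree_weight(edges, s)
            for s in combinations(range(1, n + 1), m)}

# Hypothetical phylogenetic 4-tree: leaves 1, 2 hang off u and leaves 3, 4 off v.
edges = [(1, "u", 1), (2, "u", 2), ("u", "v", 3), (3, "v", 4), (4, "v", 5)]
D2 = dissimilarity_vector(edges, 4, 2)
D3 = dissimilarity_vector(edges, 4, 3)
```

For this tree, `D2[(1, 2)]` is 3 and `D3[(1, 3, 4)]` is 13, the latter being the weight of the whole tree minus the edge to leaf 2.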

Definition 2.7. A metric space $S$ with distance function $d : S \times S \to \mathbb{R}_{\geq 0}$ is called an ultrametric space if the following inequality holds for all $x, y, z \in S$:

$$d(x, z) \leq \max\{d(x, y), d(y, z)\}$$

It is a well known fact that finite ultrametric spaces are realized by ultrametric trees, 
see for example jSj Lemma 11.1]. 
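For a finite space, the ultrametric inequality can be checked by brute force. The following sketch assumes the distances are stored in a dictionary keyed by unordered pairs; the function name and both sample distance tables are ours.

```python
from itertools import permutations

def is_ultrametric(points, dist):
    """Check d(x, z) <= max(d(x, y), d(y, z)) over all triples of distinct points."""
    def d(x, y):
        return 0 if x == y else dist[frozenset((x, y))]
    return all(d(x, z) <= max(d(x, y), d(y, z))
               for x, y, z in permutations(points, 3))

# Leaves 1 and 2 are close to each other; leaf 3 is equally far from both.
tree_like = {frozenset((1, 2)): 2, frozenset((1, 3)): 6, frozenset((2, 3)): 6}
# Here d(2, 3) = 6 exceeds max(d(2, 1), d(1, 3)) = 3.
not_ultra = {frozenset((1, 2)): 2, frozenset((1, 3)): 3, frozenset((2, 3)): 6}
```

The first table satisfies the inequality for every triple, while the second fails it.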
2.3 Column Reductions.

Let $n \geq 4$. Suppose we are given integers $1 \leq a, b \leq n$ with $a \neq b$ and let $c_{a,b}$ be the operator acting on Puiseux matrices for which, for any $n \times n$ matrix $M$, $c_{a,b}(M)$ is the matrix obtained from $M$ by subtracting column $b$ from column $a$. We know $c_{a,b}$ preserves the determinant, i.e. $\det(c_{a,b}(M)) = \det(M)$. For $l \geq 1$, let $(c_{a_l,b_l} \circ \cdots \circ c_{a_2,b_2} \circ c_{a_1,b_1})(M)$ be the matrix obtained from $M$ by first subtracting column $b_1$ from column $a_1$, then subtracting column $b_2$ from column $a_2$, and so on up to subtracting column $b_l$ from column $a_l$. Call this matrix a column reduction of $M$ if the following conditions are met:

• $1 \leq a_1, \ldots, a_l, b_1, \ldots, b_l \leq n$

• the numbers $a_1, a_2, \ldots, a_l$ are pairwise different

• whenever $1 \leq k \leq l$, the number $b_k$ is different from $a_1, \ldots, a_k$.

For simplicity, we will accept $M$ as a column reduction of itself.
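The three conditions and the invariance of the determinant can be checked on a small numerical matrix. This is a sketch under our own conventions: operators are given as a list of pairs $(a_k, b_k)$ in the order in which they are applied, and a $4 \times 4$ integer Vandermonde matrix stands in for a Puiseux matrix.

```python
def is_column_reduction(ops, n):
    """ops = [(a_1, b_1), ..., (a_l, b_l)], listed in the order they are applied."""
    a_seen = []
    for a, b in ops:
        if not (1 <= a <= n and 1 <= b <= n):
            return False
        if a in a_seen:      # the a_k must be pairwise different
            return False
        a_seen.append(a)
        if b in a_seen:      # b_k must differ from a_1, ..., a_k
            return False
    return True

def apply_ops(M, ops):
    """Subtract column b from column a for each (a, b), in order."""
    R = [row[:] for row in M]
    for a, b in ops:
        for row in R:
            row[a - 1] -= row[b - 1]
    return R

def det(M):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

M = [[1, 1, 1, 1], [1, 2, 3, 4], [1, 4, 9, 16], [1, 8, 27, 64]]
ops = [(1, 2), (3, 4), (2, 4)]
R = apply_ops(M, ops)
```

Because each column $b_k$ is still unmodified when it is used, the reduced first row becomes $(0, 0, 0, 1)$ while the determinant is unchanged.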



3 Main Result. 



We are now ready to prove Theorem 1.3. Cools [4] reduced it to the following statement, which we now prove.

Proposition 3.1 (Cools [4], Conjecture 4.4). Assume $n \geq 4$. Let $T$ be a $d$-equidistant ultrametric $n$-tree with root $r$ and such that all its edges have rational weight.

For each edge $e$ of $T$, denote by $h(e)$ the well-defined sum of the weights of all the edges in the path from the top node of $e$ to any leaf below $e$, and let $a_1(e), \ldots, a_{n-2}(e)$ be generic complex numbers.

Let $x_i^{(j)} \in \mathbb{K}$ (with $i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, n-2\}$) be the sum of the monomials $a_j(e)\, t^{-h(e)}$, where $e$ runs over all edges between $r$ and $i$. Then, the valuation of the determinant of

$$M = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} \\ (x_1^{(1)})^2 & (x_2^{(1)})^2 & \cdots & (x_n^{(1)})^2 \\ x_1^{(2)} & x_2^{(2)} & \cdots & x_n^{(2)} \\ \vdots & \vdots & & \vdots \\ x_1^{(n-2)} & x_2^{(n-2)} & \cdots & x_n^{(n-2)} \end{pmatrix}$$

is equal to $-D$, where $D$ is the total weight of $T$.

In the course of the proof, we assume T is binary, which follows from the construction 
of Bocci and Cools [2]. Notice they start with a phylogenetic tree and then define an 
associated ultrametric from its 2-dissimilarity vector, therefore inducing an ultrametric 
tree. Here, T corresponds to certain subtrees of this induced ultrametric tree. 
Proof. As $T$ is binary, we know $T$ has $n$ leaves, $n - 2$ internal nodes of degree three, one node (the root) of degree two and $2(n - 1)$ edges.

Let $\leq_T$ be the tree order of $T$ with respect to $r$, i.e. the order on the set of nodes of $T$ by which $v \leq_T w$ iff $v$ lies in the path from $r$ to $w$ in $T$. Let $v_1, v_2, \ldots, v_{n-1}$ be the $n - 1$ internal nodes of $T$ numbered in such a way that if $v_i \leq_T v_j$, then $j \leq i$. We must have $v_{n-1} = r$.

Consider an injective function $a : v_i \mapsto a_i$ from the set of internal nodes to the leaves of $T$ so that $v_i \leq_T a_i$ for all $i$ with $1 \leq i \leq n - 1$. Now, for each of these values of $i$, let $b_i$ be the unique leaf such that $b_i \neq a_j$ for all $j$ with $1 \leq j \leq i$, and such that $v_i \leq_T b_i$.

To show the existence of $a$, we construct it successively, starting with $a(v_1)$, then $a(v_2)$, and continuing until we define $a(v_{n-1})$. Suppose we have already defined $a(v_1), \ldots, a(v_{i-1})$ for some $i \leq n - 1$. Consider the maximal subtree $T_i$ of $T$ whose root is $v_i$, i.e. $T_i$ is the subtree below $v_i$. If this tree has $m$ leaves then it has $m - 1$ internal nodes, including $v_i$ itself. So far we have not defined $a$ for nodes between $r$ and $v_i$, but we have defined it for all internal nodes of $T_i$ different from $v_i$. Therefore, exactly $m - 2$ leaves of the tree $T_i$ have been assigned to some of $v_1, \ldots, v_{i-1}$ under $a$, so there are 2 leaves which we can assign to $v_i$, and $a(v_i)$ can be either one of them. Incidentally, this also gives us the existence and uniqueness of the respective $b_i$.
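The successive construction of $a$ and of the leaves $b_i$ can be sketched as follows. The tree shape used here is an assumption: it is a 10-leaf binary tree chosen to be consistent with the map $a$ and the column operators quoted for Figure 1 and Example 3.2, since the figure itself is not reproduced. The rule of always picking the smaller of the two free leaves is likewise our choice; the proof allows either.

```python
def leaves_below(children, v):
    """Leaves of the subtree rooted at v (leaves are the nodes without children)."""
    if v not in children:
        return [v]
    left, right = children[v]
    return leaves_below(children, left) + leaves_below(children, right)

def internal_postorder(children, v):
    """Internal nodes, each listed after all internal nodes below it."""
    if v not in children:
        return []
    left, right = children[v]
    return (internal_postorder(children, left)
            + internal_postorder(children, right) + [v])

def assign_leaves(children, root):
    """Build a(v) and b(v) for every internal node v, descendants first."""
    a, b, used = {}, {}, set()
    for v in internal_postorder(children, root):
        free = sorted(x for x in leaves_below(children, v) if x not in used)
        assert len(free) == 2      # exactly two leaves remain unassigned below v
        a[v], b[v] = free          # a(v) is the smaller free leaf, b(v) the other
        used.add(a[v])
    return a, b

# Assumed shape of the rooted 10-tree; the root is v9.
children = {
    "v1": (1, 2), "v2": (4, 5), "v3": (6, 7), "v4": (8, 9),
    "v5": (3, "v2"), "v6": ("v3", "v4"), "v7": ("v1", "v5"),
    "v8": ("v6", 10), "v9": ("v7", "v8"),
}
a, b = assign_leaves(children, "v9")
```

With this tree the sketch returns $a(v_1) = 1$, $a(v_2) = 4$, $a(v_5) = 3$ and $a(v_9) = 5$, among others, and the leaf $b_{n-1}$ attached to the root is 10.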

Now, we want to establish the equality $\sum_{i=1}^{n-1} h(v_i) = D - d$, where $h(v)$ denotes, for an internal node $v$, the sum of the weights of the edges on the path from $v$ to any leaf below it. This equality is clearly true when $T$ has 2 or 3 leaves, so that $n = 2$ or $n = 3$. Let now $n \geq 4$ and suppose we have proved the result for all trees with $i$ leaves with $i < n$. Recall $n$ is being taken as the number of leaves in $T$, which is rooted $d$-equidistant with root $r = v_{n-1}$. We know the equality holds for each of the subtrees $T_1, \ldots, T_{n-2}$ below $v_1, \ldots, v_{n-2}$, respectively. Let $T_{n-2}$ be $d_{n-2}$-equidistant and let $T_{n-3}$ be $d_{n-3}$-equidistant. There are two cases to distinguish. If $v_{n-2} \leq_T v_i$ for all $i \leq n - 2$, then $\sum_{i=1}^{n-2} h(v_i) = (D - d - (d - d_{n-2})) - d_{n-2} = D - 2d$ by induction, so $\sum_{i=1}^{n-1} h(v_i) = D - d$. Otherwise, suppose $v_{n-2}$ and $v_{n-3}$ are incomparable in $\leq_T$. Then $T_{n-2}$ and $T_{n-3}$ are disjoint graphs and we have

$$\sum_{v_i \in V(T_{n-2})} h(v_i) = \left( D - \left( \sum_{v_j \in V(T_{n-3})} h(v_j) + d_{n-3} \right) - (d - d_{n-3}) - (d - d_{n-2}) \right) - d_{n-2}$$

by induction. Reordering we get

$$\sum_{v_i \in V(T_{n-2})} h(v_i) + \sum_{v_j \in V(T_{n-3})} h(v_j) = D - 2d$$

so if we add $h(v_{n-1}) = d$ to both sides we get our result.
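The identity can be checked numerically on an equidistant tree described by the heights of its internal nodes, where an edge from a parent to a child has weight equal to the difference of their heights and leaves have height 0. The tree and heights below are an assumption: they are chosen to match the values $d = 9$ and $D = 35$ quoted for Figure 1 and the valuations listed in Example 3.2, not read off the figure.

```python
def total_weight(children, height):
    """Sum of edge weights; leaves have height 0 and do not appear in `height`."""
    return sum(height[parent] - height.get(child, 0)
               for parent, kids in children.items()
               for child in kids)

# Assumed shape of the rooted 10-tree; the root is v9.
children = {
    "v1": (1, 2), "v2": (4, 5), "v3": (6, 7), "v4": (8, 9),
    "v5": (3, "v2"), "v6": ("v3", "v4"), "v7": ("v1", "v5"),
    "v8": ("v6", 10), "v9": ("v7", "v8"),
}
# h(v) for each internal node v.
height = {"v1": 1, "v2": 1, "v3": 1, "v4": 1, "v5": 2,
          "v6": 3, "v7": 4, "v8": 4, "v9": 9}

D = total_weight(children, height)   # total weight of the tree
d = height["v9"]                     # common root-to-leaf distance
sum_of_heights = sum(height.values())
```

For this tree the total weight is 35 and the heights add up to $26 = 35 - 9$, in agreement with the equality just proved.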
Now consider the column reduction $M^* = (c_{a_{n-1},b_{n-1}} \circ \cdots \circ c_{a_2,b_2} \circ c_{a_1,b_1})(M)$. We claim that the valuation of every nonzero monomial $\prod_{i=1}^{n} M^*_{i,\sigma(i)}$ with $\sigma \in S_n$ in the sum

$$\det(M^*) = \sum_{\sigma \in S_n} \left( \mathrm{sgn}(\sigma) \prod_{i=1}^{n} M^*_{i,\sigma(i)} \right)$$

is precisely $-\left( \sum_{i=1}^{n-1} h(v_i) + d \right) = -D$. To see this, notice that for all $i$ with $1 \leq i \leq n - 1$, we have

• $M^*_{1,a_i} = 0$

• the valuation of $M^*_{3,a_i}$ is $-d - h(v_i)$

• the valuation of $M^*_{j,a_i}$ is $-h(v_i)$ if $j \neq 1$ and $j \neq 3$

• the only nonzero term in the first row of $M^*$ is the 1 in column $b_{n-1}$

Because of our generic choice of coefficients, we can find some monomial term in the sum $\det(M^*)$ with valuation $-D$ which doesn't get cancelled, so we are done.

□ 
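The statement of Proposition 3.1 can be tested on a small case with exact arithmetic. The sketch below takes $n = 4$ and a hypothetical $3$-equidistant tree: the root has children $v_1$ (height 1, leaves 1 and 2) and $v_2$ (height 2, leaves 3 and 4), so that $D = 9$. The edge names, the fixed integer coefficients standing in for generic complex numbers, and the layout of $M$ (a row of ones, the row $x^{(1)}$, its entrywise square, and the row $x^{(2)}$) are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import permutations

def add(f, g):
    """Sum of two finite Puiseux sums {exponent: coefficient}, zero terms dropped."""
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def mul(f, g):
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def val(f):
    return min(f) if f else float("inf")

def sign(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def det(M):
    """Leibniz expansion; adequate for a 4 x 4 matrix."""
    total = {}
    for p in permutations(range(len(M))):
        term = {Fraction(0): sign(p)}
        for i, j in enumerate(p):
            term = mul(term, M[i][j])
        total = add(total, term)
    return total

# paths[i]: edges between the root and leaf i, as (edge name, h(edge)).
paths = {1: [("A", 3), ("e1", 1)], 2: [("A", 3), ("e2", 1)],
         3: [("B", 3), ("e3", 2)], 4: [("B", 3), ("e4", 2)]}
# coeff[j][e] plays the role of a_j(e); fixed integers instead of generic values.
coeff = {1: {"A": 1, "B": 2, "e1": 3, "e2": 5, "e3": 7, "e4": 11},
         2: {"A": 13, "B": 17, "e1": 19, "e2": 23, "e3": 29, "e4": 31}}

def x(i, j):
    """x_i^(j): sum of a_j(e) t^(-h(e)) over the edges e between the root and leaf i."""
    return {Fraction(-h): coeff[j][e] for e, h in paths[i]}

leaves = [1, 2, 3, 4]
M = [[{Fraction(0): 1} for _ in leaves],
     [x(i, 1) for i in leaves],
     [mul(x(i, 1), x(i, 1)) for i in leaves],
     [x(i, 2) for i in leaves]]
D = 9                      # edge weights 2 + 1 + 1 + 1 + 2 + 2
valuation = val(det(M))
```

For these coefficients the terms of exponent below $-9$ cancel exactly and the determinant has valuation $-9 = -D$. A different choice of coefficients could in principle produce extra cancellation, which is why the proposition asks for generic ones.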




Figure 1: A rooted 10-tree. The injective function $a = \{(v_1,1),(v_2,4),(v_3,6),(v_4,8),(v_5,3),(v_6,7),(v_7,2),(v_8,9),(v_9,5)\}$ is depicted, as well as the equality $\sum_{i=1}^{9} h(v_i) = 35 - 9$.
Example 3.2. Consider the 9-equidistant 10-tree of Figure 1 with total weight 35. The second row of the matrix $M$ associated to this tree is the following vector with generic complex coefficients:



,er^ + hr"^ + gt'^ + ,rt~^ + xt 



yr^ + zt'^ + qr^ 



-3 



zr* + 



Using the operator $(c_{5,10} \circ c_{9,10} \circ c_{2,5} \circ c_{7,9} \circ c_{3,5} \circ c_{8,9} \circ c_{6,7} \circ c_{4,5} \circ c_{1,2})$ suggested by the figure, we obtain the column reduction $M^*$ whose second row is the vector:



[(a - h)r^ 

- et + (c - h)t-^ 

er^ + hr^ + {g - w)t~^ + {p- q)t~ 



{s - v)r' + {x- y)r 



'1 1 

It has valuation vector: 



{b-e)t-'-ht-' + if~9)t- 



(r — s)t 



-1 



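$$(-1, -4, -2, -1, -9, -1, -3, -1, -4, -9)$$

whose first nine entries are

$$(-h(v_1), -h(v_7), -h(v_5), -h(v_2), -h(v_9), -h(v_3), -h(v_6), -h(v_4), -h(v_8)),$$

where $v_1, v_7, v_5, v_2, v_9, v_3, v_6, v_4, v_8$ are the preimages of $1, 2, 3, 4, 5, 6, 7, 8, 9$ under $a$, respectively, in that order. The tenth entry, $-9 = -d$, belongs to column 10, which none of the operators modifies. Also notice that $\sum_{i=1}^{9} h(v_i) = 35 - 9$.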

We have shown that the $m$-dissimilarity vector of a phylogenetic tree $T$ with $n$ leaves gives a point in the tropical Grassmannian $\mathcal{G}_{m,n}$, and therefore gives rise to a tropical linear space. The combinatorial structure of those tropical linear spaces is the subject of an upcoming paper [5].